基於使用者偏好之影片快轉模型 User-preference-based Video Fast-forwarding Model

Authors

Sheng-Jie Luo

Bing-Yu Chen

Abstract

In this thesis, we propose a new video interaction model, called adaptive fast-forwarding, that helps people quickly browse videos using predefined semantic rules. The model is designed around the metaphor of scenic car driving, in which the driver slows down near areas of interest and speeds through unexciting areas. Results from a preliminary user study of our video player suggest the following:

• The player should adaptively adjust the current playback speed based on the complexity of the present scene and predefined semantic events.
• The player should learn user preferences about predefined event types as well as a suitable playback speed.
• The player should fast-forward the video continuously, at a playback rate acceptable to the user, to avoid missing any undefined events or areas of interest.

Furthermore, we provide an absolute speed control model with an analog controller to enhance the experience of speed control. Our user study results suggest that, for certain types of video, our system, SmartPlayer, yields a better user experience in browsing and fast-forwarding videos than the interaction models of existing video players.
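The three guidelines in the abstract can be illustrated with a small sketch. This is not SmartPlayer's implementation, which the abstract does not describe. The names `Preferences`, `learn`, and `playback_rate`, the linear mapping from scene complexity to speed, the moving-average update, and all default values are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Preferences:
    """Hypothetical per-user settings; not taken from the thesis."""
    base_rate: float = 4.0   # assumed cruising speed for uneventful footage
    max_rate: float = 8.0    # assumed fastest rate the user still accepts
    event_rates: Dict[str, float] = field(default_factory=dict)

    def learn(self, event_type: str, chosen_rate: float, alpha: float = 0.5) -> None:
        """Blend the rate the user just chose into the stored rate for this event type."""
        old = self.event_rates.get(event_type, chosen_rate)
        self.event_rates[event_type] = (1 - alpha) * old + alpha * chosen_rate


def playback_rate(complexity: float, event_type: Optional[str], prefs: Preferences) -> float:
    """Pick a playback rate for the current scene.

    complexity is assumed to be normalized to [0, 1], where 0 is a static
    scene and 1 is a very busy one. event_type is None when no predefined
    semantic event is active.
    """
    c = min(max(complexity, 0.0), 1.0)
    # Busy scenes slow playback toward 1x; quiet scenes run at the base rate.
    rate = prefs.base_rate - c * (prefs.base_rate - 1.0)
    # A learned event preference can only slow playback further.
    if event_type in prefs.event_rates:
        rate = min(rate, prefs.event_rates[event_type])
    # Playback never pauses or skips, so undefined events stay visible.
    return max(1.0, min(rate, prefs.max_rate))
```

With the default `base_rate` of 4.0, a static scene plays at 4x and a maximally complex one at 1x. After `learn("goal", 1.0)` followed by `learn("goal", 2.0)`, the stored rate for the hypothetical `"goal"` event is 1.5, so a quiet scene containing that event plays at 1.5x.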

Similar resources

基於聽覺感知模型之類神經網路及其在語者識別上之應用 (Two-stage Attentional Auditory Model Inspired Neural Network and Its Application to Speaker Identification) [In Chinese]

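According to neurophysiological research, the ear separates sound into its frequency components and produces an auditory spectrum. Based on the phenomenon of attentive listening and on biological auditory experiments, researchers have also identified patterns of neural activity in the auditory cortex. In this thesis, we use neural networks to build a model that simulates human hearing and discuss it in the context of speaker identification, hoping to connect neurophysiological knowledge with engineering techniques. The model we designed uses two convolutional neural network (CNN) layers of different dimensionality to simulate the early cochlear stage and the cortical stage, respectively. The convolution kernels are given designed initial values, namely multiple sets of one-dimensional frequency-decomposition filters for the cochlear stage and two-dimensional filters that jointly analyze temporal and spectral information for the cortical stage, so that the model converges quickly. Through training, the model automatically adjusts its parameters according to the task and environmental factors so that the input data are mapped to the target form. We also compared several variants of the proposed architecture and found that, given the initial values, even when training is insufficient, the model can still...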

基於稀疏表示之語者識別 (Sparse Representation Based Speaker Identification) [In Chinese]

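The sparse representation classifier (SRC) is a machine learning method based on the sparse representation of images. In image and face recognition research, SRC achieves very good recognition performance and robustness. In view of SRC's high discriminative power in image recognition, many speaker identification methods based on sparse representation have been proposed in recent years. This thesis proposes a recognition system based on sparse representation. We construct supervectors using probabilistic principal component analysis (PPCA) and add a statistical test to adjust eigenvalue selection, so that the dimensionality of each Gaussian in a speaker's Gaussian mixture model (GMM) can be adapted to the data. Next, we enhance the sparse dictionary through...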

主題語言模型於大詞彙連續語音辨識之研究 (On the Use of Topic Models for Large-Vocabulary Continuous Speech Recognition) [In Chinese]

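This thesis studies language models that use topic information. When a language model is used for large-vocabulary continuous speech recognition, its main task is to predict the likelihood of the next candidate word from the already decoded word history. Conventional N-gram language models tend to be limited by their large number of parameters; they capture only short-range word-adjacency information and cannot take into account the semantic information of the full word history. Therefore, over the past decade or so, researchers have proposed a variety of topic models, including probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA), which model the relationship between documents and words, and the word topic model (WTM), which models the relationship between word pseudo-documents and words. These models mainly describe the relationship between documents and words, or between word pseudo-documents and words, through a set of latent topic probability distributions...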

運用概念模型化技術於中文大詞彙連續語音辨識之語言模型調適 (Leveraging Concept Modeling Techniques for Language Model Adaptation in Mandarin Large Vocabulary Continuous Speech Recognition) [In Chinese]

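In practice, the concept model searches for in-domain documents (or adaptation corpora) related to the initial speech recognition results and uses several concepts expressed in them to approximate the meaning the speaker actually intends to convey; a concept language model is then built on this basis. The construction of the concept language model is explored from two aspects: the "word" aspect and the "document clustering" aspect. First, we develop the so-called word-based concept language model and apply it to language model adaptation. When building the word-based concept language model, we aim to select, for the differing semantic content of each utterance (the first-pass speech recognition result, represented as a word graph [3]), a set of representative concepts from several relevant documents in the adaptation corpus...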

Stacking Heterogeneous Joint Models of Chinese POS Tagging and Dependency Parsing

Previous joint models of Chinese part-of-speech (POS) tagging and dependency parsing are extended from either graph- or transition-based dependency models. Our analysis shows that the two models have different error distributions. In addition, integration of graph- and transition-based dependency parsers by stacked learning (stacking) has achieved significant improvements. These motivate us to stud...

Publication date: 2009
